Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Iterator for Bound<'py, PySequence> #3927

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

LilyFoote
Copy link
Contributor

When working on #3923 I noticed that we didn't have an implementation of Iterator for PySequence. I'm not 100% certain this is the right thing to add, since it hides the fallibility of len and get_item with .expect calls. But a properly implemented Python Sequence shouldn't raise in these, so maybe the ergonomics justify allowing panics here?

If we want this, I think now would be a good time to add it, since the changes to error handling it introduces would be straightforward as part of the general Bound migration.

@davidhewitt
Copy link
Member

Thanks for proposing this!

I think it's definitely an open design question. There's some possibility already for the set and frozenset iterators to panic in the event of a poorly-implented subclass. That said, I think poorly-implemented sequences seems more likely. Is that a problem? I'm not sure.

I'd be interested to know what was the original motivation. Was it completeness? Do you use this one downstream? Is this noticeably faster than going via Bound<PyIterator>?

While it's true that this could be a good time to add it, I'm also tempted to argue the opposite, and that (where possible) we should be aiming to keep the differences between the Bound and GIL Ref APIs as small as possible.

I think adding this in a future PyO3 would be breaking but might not be that hard to fixup in isolation, so I'm not super worried if this doesn't make it in 0.21.

@LilyFoote
Copy link
Contributor Author

I think poorly-implemented sequences seems more likely. Is that a problem? I'm not sure.

Yeah, me neither.

I'd be interested to know what was the original motivation. Was it completeness?

Pretty much this - I was looking through the pyo3 codebase for any uses of .iter() in a for loop to see if they could be simplified.

Do you use this one downstream?

No.

Is this noticeably faster than going via Bound?

I haven't tested this.

While it's true that this could be a good time to add it, I'm also tempted to argue the opposite, and that (where possible) we should be aiming to keep the differences between the Bound and GIL Ref APIs as small as possible.

Fair.

I think adding this in a future PyO3 would be breaking but might not be that hard to fixup in isolation, so I'm not super worried if this doesn't make it in 0.21.

Avoiding the need for a breaking change was the main motivation for suggesting it for 0.21.

@davidhewitt
Copy link
Member

davidhewitt commented Mar 6, 2024

I've been thinking about this more. pydantic-core does iterate sequences in a few places, we could upgrade very straightforwardly to use infallible iteration. But I'm still uneasy that infallible iteration is correct; panics tend to upset Pydantic users.

Maybe the right way forward here is to just make the iteration fallible? It would at least avoid the need for the .iter() call which you noticed. That would remove my main concern tbh.

A related question would be whether any other types deserve iterators too. PyString, PyBytes, PyMapping?

This allows using a `Bound<'py, PySequence>` directly in a for loop.
Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the long pause here; I'd missed that you pushed the fallible iteration.

In light of #4553, this might now have more space in the API as a clearly separate method from .try_iter().

That said, I'm a little uneasy about the move away from Python iteration here, and the subtleties of the breaking change. e.g. what happens if the sequence implements __iter__ and does something a bit... exotic?

Perhaps let's revisit in 0.24 when users should have migrated off .iter(), so it's less likely to be an in-place breaking change.

Comment on lines +493 to +495
unsafe fn get_item(&self, index: usize) -> PyResult<Bound<'py, PyAny>> {
self.sequence.get_item(index)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
unsafe fn get_item(&self, index: usize) -> PyResult<Bound<'py, PyAny>> {
self.sequence.get_item(index)
}
fn get_item(&self, index: usize) -> PyResult<Bound<'py, PyAny>> {
self.sequence.get_item(index)
}

@davidhewitt davidhewitt added this to the 0.24 milestone Oct 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants